Introduction to GPU Radix Sort
Abstract
The prefix sum at a location is the sum of all values in the preceding locations of the sequence, in this case those to the left of the current location. In radix sort the sequence being summed holds the number of occurrences of each value, so the prefix sum gives the total count of all elements whose value is less than the current value. For example, the prefix sum at location 2, and hence for value 2, is o2 = 3: the sequence contains 3 entries that are 0s or 1s. Thus 3 is the destination address of the first 2 in the data set. The destination address of an element is the offset computed by the prefix sum plus the element's index among the elements of the same value in the original array, so the second 2 in the array goes to location 3 + 1 = 4. Shuffling each element to its destination address yields the sorted array.
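The addressing scheme described above can be illustrated with a minimal sequential sketch of a single counting pass. It is not the GPU implementation: the function name `counting_pass`, the parameter `num_buckets`, and the sample input are assumptions made for illustration, with the input chosen so that the offset for value 2 comes out as 3, as in the example.

```python
def counting_pass(values, num_buckets):
    """One counting pass: histogram, exclusive prefix sum, scatter.

    Sequential illustration only; a GPU version would compute these
    steps in parallel. Names are hypothetical, not from the source.
    """
    # Count how many times each value occurs.
    histogram = [0] * num_buckets
    for v in values:
        histogram[v] += 1

    # Exclusive prefix sum: offsets[b] = number of elements smaller than b.
    offsets = [0] * num_buckets
    running = 0
    for b in range(num_buckets):
        offsets[b] = running
        running += histogram[b]

    # Destination = offset of the value + index among equal values so far.
    seen = [0] * num_buckets
    destinations = []
    for v in values:
        destinations.append(offsets[v] + seen[v])
        seen[v] += 1

    # Shuffle each element to its destination address.
    output = [None] * len(values)
    for v, d in zip(values, destinations):
        output[d] = v
    return offsets, destinations, output


# Assumed sample data: two 0s and one 1, so the offset for value 2 is 3.
offsets, destinations, output = counting_pass([2, 0, 1, 2, 0, 3], 4)
print(offsets)       # [0, 2, 3, 5]
print(destinations)  # [3, 0, 2, 4, 1, 5]
print(output)        # [0, 0, 1, 2, 2, 3]
```

In this sample the first 2 lands at address 3 and the second 2 at address 3 + 1 = 4. Because elements with equal values keep their original relative order, the pass is stable, which is what allows a radix sort to apply it once per digit.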
Similar resources
GRS - GPU radix sort for multifield records
We extend the number sorting algorithms on the GPU to sort large multi-field records. We notice that the traditional way of sorting the records, by first sorting a (key, index) pair to obtain the sorted permutation of the records and then rearranging the entire records to their final position, might not actually be the most efficient way to sort them depending on the type of sorting algor...
Sorting On A Graphics Processing Unit (GPU)
2.1 Graphics Processing Units; 2.2 Sorting Numbers on GPUs; 2.2.1 SDK Radix Sort Algorithm; 2.2.1.1 Step 1–Sorting tiles ...
Fast parallel GPU-sorting using a hybrid algorithm
This paper presents an algorithm for fast sorting of large lists using modern GPUs. The method achieves high speed by efficiently utilizing the parallelism of the GPU throughout the whole algorithm. Initially, GPU-based bucketsort or quicksort splits the list into enough sublists then to be sorted in parallel using merge-sort. The algorithm is of complexity n log n, and for lists of 8M elements...
Fast radix sort for sparse linear algebra on GPU
Fast sorting is an important step in many parallel algorithms, which require data ranking, ordering or partitioning. Parallel sorting is a widely researched subject, and many algorithms were developed in the past. In this paper, the focus is on implementing highly efficient sorting routines for the sparse linear algebra operations, such as parallel sparse matrix matrix multiplication, or factor...
Energy-Efficient Sorting on a Many-Core Platform
As processors move from multi-core to many-core architectures, opportunities arise for energy-efficient enterprise computations, such as sorting, on large arrays of processors. This paper proposes three different energy-efficient sorting methods for the first phase of an external sort simulated on a varying sized fine-grained many-core processor arrays used as a co-processor to an Intel CPU, wh...
Publication date: 2011